Random Number Generation in the Parallel Environment - Distributed Memory Computing Conference, 1990., Proceedings of the Fifth

Authors

F. Sharp

Charles H. Still

Abstract

By randomly seeding individual processors in a parallel environment with unique random number generators, it is possible to take full advantage of the economies of scale present in the parallel environment to achieve more accurate simulations. While a single random number generator is sufficient for a serial computer, the same is not true for a parallel computer. Multiple copies of the same generator do not improve the quality of the simulation, as the period may be insufficient to prevent exhaustion or 'banding' of the variates. Our approach is to provide each processor with its own unique random number generator and use a common seed value. This ensures each simulation is unique, as each generator is different due to random assignment by the front-end computer. The linear congruential method was chosen due to widespread familiarity and acceptance of the technique. Using a sequence of random numbers generated on the front-end computer, prime numbers are selected from a predefined array of 2048 primes and assigned to processors. To provide the maximum possible period to the generators, all 2048 primes in the array are six digits in size. This gives the researcher the ability to run simulations involving up to a million random numbers with a high degree of certainty that each processor is running a different simulation. By taking advantage of the large periods, the economies of scale available on a parallel machine can then be exploited to run large-scale simulations involving millions of numbers, which would be prohibitive on a serial machine.

Random Numbers and Simulation

Simulation has achieved a new level of importance in the modern era. Many things of interest to researchers cannot be directly viewed, experimented upon, or actually done due to prohibitive cost or danger. In these cases, simulation becomes the researcher's main tool. Simulations of real-world systems are usually mathematical models.
Nuclear plants, national economies, and plane crashes are only a few of the systems which can be modeled by the use of mathematics. However, the formulas themselves are useless without some sort of input to drive them. Random numbers drive the equations and produce the results. Random numbers are usually the input for these models since the models are based, for the most part, on probabilities. As an example, a splitting atom gives off a neutron, which has a probability $k$ of hitting another atom and forcing it to split. There is also, however, the probability $1 - k$ that the neutron will not hit another atom and the reaction may die.

This method of simulation is known as stochastic simulation but is usually referred to as Monte Carlo simulation. Monte Carlo involves sampling from a specific distribution, usually the Uniform (0,1], as these numbers approximate probabilities. The samples are then used to estimate the value of an integral, even in cases where the integral may not be readily apparent. Random numbers must be generated somehow, as computers, where most simulations are now done, do not have built-in tables of random numbers. It is for this purpose that the congruential generator methods were developed.

The Linear Congruential Method

For the purpose of this paper, the Linear Congruential Generator (LCG) method was employed. Motivating our choice of the LCG was ease of programming, debugging, and understanding. The congruential generation method is probably the most commonly used method for generating random numbers today. For a complete explanation of the method see either Kennedy and Gentle [2] or Rubinstein [4]. For those who may not be acquainted with these methods we shall provide a brief overview.

Congruential methods generate pseudo-random numbers by using a recursive formula. The numbers generated are called pseudo-random since they are, perforce, deterministic.
Given the same formula, the same seed value will always generate the same sequence of "random" numbers. This is not truly unfortunate since it makes replicating results possible, which a truly random sequence of numbers would make impossible. The general form of a congruential generator is

$$x_i \equiv (a x_{i-1} + c) \pmod{m} \quad \text{for } i = 1, 2, \ldots$$

where

$x_i$ is the next random number in the sequence,
$a$ is the multiplier,
$c$ is the increment,
$m$ is the modulus,

and $0 \le x_i < m$. The value of $x$ at $i = 0$ is referred to as the seed and is provided by the user. $a$, $c$, and $m$ must all be nonnegative. The generator can have a maximal period, the interval between repeating values, of $m$ if and only if the following conditions are met [2]:

1. $c$ is relatively prime to $m$
2. $a \equiv 1 \pmod{p}$ for every prime factor $p$ of $m$
3. $a \equiv 1 \pmod{4}$ if 4 is a factor of $m$

In reality, finding a quadruple $(a, c, p, m)$ would take too much time to be justifiable. A reasonable approximation for $c$ can be made by

$$c = \left(\tfrac{1}{2} - \tfrac{1}{6}\sqrt{3}\right) m$$

which was proposed by Knuth [3].

The Current Approach

Fox et al. [1] suggest using the linear congruential generator method, but adapting it to the parallel architecture of a hypercube. The approach they propose is to load each node with the same generator but have nodes "leapfrog" each other. This staggering effect is obtained by having each node step into the sequence of randoms $n$ variates, where $n$ is the processor's number, starting at zero. Formally, if there are $p$ nodes, this means that node 0 gets $x_0$ as a random value; node 1 gets $x_1$, node 2 gets $x_2$, ..., node $p-1$ gets $x_{p-1}$, node 0 gets $x_p$, node 1 gets $x_{p+1}$, and so on. While this accomplishes the task of generating random numbers for each node, and parallelizes the task, it carries some inherent problems. First, this method uses only one random number generator.
While some randomization of the nodes is introduced by the stepping algorithm, the fact that only one sequence of numbers is being used is not overcome. This can have debilitating side effects. If the period is not large enough, the nodes might overlap, producing multiple simulations which are identical. True, this is a pathological example, but even if the period is large enough to prevent overlap, banding will occur in the randoms. By banding, we mean that the points $(x_0, x_1), (x_2, x_3), \ldots$, when plotted, produce bands, indicating a high degree of correlation.

It is possible for a simulation to exhaust the period of the random number generator, since the period is reduced in size to $m/n$, where $n$ is the number of nodes in a simulation. In such cases, the generator would begin generating the same randoms again, due to the deterministic nature of the algorithm. Since the nodes are staggered, each node would begin to progressively exhaust the period of its generator. Nodes would then begin to rerun simulations which had already been run by other nodes, thereby producing identical results. Fox's method does provide a way of generating random numbers for parallel simulations, but it is a method open to needless redundancy of effort.
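The leapfrog scheme and the period reduction discussed above can be illustrated with a minimal Python sketch. It is not the implementation of Fox et al. or of the authors. The toy parameters a = 5, c = 3, m = 16 and the helper names (`has_full_period`, `sequence`, `leapfrog`, `stream_period`) are assumptions chosen only for this example, and each node's stream is obtained naively by stepping through the single shared sequence.

```python
from itertools import islice
from math import gcd


def prime_factors(n):
    """Return the set of prime factors of n (trial division)."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def has_full_period(a, c, m):
    """Check the three maximal-period conditions quoted in the text."""
    if gcd(c, m) != 1:                                   # condition 1
        return False
    if any((a - 1) % p != 0 for p in prime_factors(m)):  # condition 2
        return False
    if m % 4 == 0 and (a - 1) % 4 != 0:                  # condition 3
        return False
    return True


def sequence(a, c, m, seed):
    """Yield x_0, x_1, x_2, ... with x_0 = seed and x_i = (a*x_{i-1} + c) mod m."""
    x = seed % m
    while True:
        yield x
        x = (a * x + c) % m


def leapfrog(a, c, m, seed, node, p):
    """Yield x_node, x_{node+p}, x_{node+2p}, ...: the variates seen by one of p nodes."""
    return islice(sequence(a, c, m, seed), node, None, p)


def stream_period(stream):
    """Count steps until the first value recurs.

    Valid for a full-period generator, where each value occurs once per cycle.
    """
    first = next(stream)
    for steps, value in enumerate(stream, start=1):
        if value == first:
            return steps


if __name__ == "__main__":
    a, c, m, seed, p = 5, 3, 16, 0, 4  # hypothetical toy parameters
    print("full period:", has_full_period(a, c, m))
    for node in range(p):
        print("node", node, list(islice(leapfrog(a, c, m, seed, node, p), 5)))
    print("per-node period:", stream_period(leapfrog(a, c, m, seed, 0, p)))
```

With these toy values the generator has period 16, but each of the four nodes revisits its own values after only 16/4 = 4 draws, which is the m/n reduction described in the text.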

Publication date: 2004
